Defining parameters for homology-tolerant database searching.
Abstract
De novo interpretation of tandem mass spectrometry (MS/MS) spectra provides sequences for searching protein databases when limited sequence information is present in the database. Our objective was to define a strategy for this type of homology-tolerant database search. Homology searches, using MS-Homology software, were conducted with 20, 10, or 5 of the most abundant peptides from 9 proteins, based either on precursor trigger intensity or on total ion current, and allowing for 50%, 30%, or 10% mismatch in the search. Protein scores were corrected by subtracting a threshold score that was calculated from random peptides. The highest (p < .01) corrected protein scores (i.e., above the threshold) were obtained by submitting 20 peptides and allowing 30% mismatch. Using these criteria, protein identification based on ion mass searching using MS/MS data (i.e., Mascot) was compared with that obtained using homology search. The highest-ranking protein was the same using Mascot, homology search using the 20 most intense peptides, or homology search using all peptides, for 63.4% of 112 spots from two-dimensional polyacrylamide gel electrophoresis gels. For these proteins, the percent coverage was greatest using Mascot compared with the use of all or just the 20 most intense peptides in a homology search (25.1%, 18.3%, and 10.6%, respectively). Finally, 35% of de novo sequences completely matched the corresponding known amino acid sequence of the matching peptide. This percentage increased when the search was limited to the 20 most intense peptides (44.0%). After identifying the protein using MS-Homology, a peptide mass search may increase the percent coverage of the protein identified.
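The two mechanical steps in the abstract, picking the N most intense peptides and subtracting a threshold from the protein score, can be sketched as follows. This is only an illustration under stated assumptions: the `Peptide` type, the function names, the example sequences, and all numeric values are hypothetical and are not taken from the study or from MS-Homology. The abstract does not say how the threshold is derived from random peptides, so it is passed in as a plain number here.

```python
from typing import NamedTuple


class Peptide(NamedTuple):
    sequence: str     # de novo sequence (hypothetical example data)
    intensity: float  # precursor intensity or total ion current


def top_n_by_intensity(peptides, n=20):
    """Return the n most intense peptides, most intense first."""
    return sorted(peptides, key=lambda p: p.intensity, reverse=True)[:n]


def corrected_score(protein_score, threshold_score):
    """Subtract the random-peptide threshold from the raw protein score."""
    return protein_score - threshold_score


# Hypothetical peptides; values chosen only to make the example runnable.
peptides = [
    Peptide("LVNELTEFAK", 8.0e5),
    Peptide("YLYEIAR", 2.5e6),
    Peptide("AEFVEVTK", 1.1e6),
]

selected = top_n_by_intensity(peptides, n=2)
print([p.sequence for p in selected])
print(corrected_score(protein_score=57.0, threshold_score=42.0))
```

A positive corrected score means the protein scored above the random-peptide threshold; in the study, the setting that gave the highest corrected scores was 20 peptides with 30% allowed mismatch.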
Related articles
Effective Query Filtering for Fast Homology Searching
To improve the accuracy of rapid homology searching it is common practice to filter all queries to mask low complexity regions prior to searching. We show in this paper, through a large-scale study of querying the PIR database, that applying popular filtering techniques unselectively to all queries may reduce retrieval effectiveness. We also show that masking queries with our new technique, caf...
Indexing Genomic Databases for Fast Homology Searching
Genomic sequence databases have been widely used by molecular biologists for homology searching. However, as amino acid and nucleotide databases are growing in size at an alarming rate, the traditional brute-force approach of comparing a query sequence against each of the database sequences is becoming prohibitively expensive. In this paper, we re-examine the problem of searching for homology in lar...
Superior performance in protein homology detection with the Blocks Database servers
The Blocks Database World Wide Web (http://www.blocks.fhcrc.org ) and Email ([email protected]) servers provide tools for the detection and analysis of protein homology based on alignment blocks representing conserved regions of proteins. During the past year, searching has been augmented by supplementation of the Blocks Database with blocks from the Prints Database, for a total of 4754 b...
RSDB: representative protein sequence databases have high information content
Motivation: Biological sequence databases are highly redundant for two main reasons: (1) various databanks keep redundant sequences with many identical and nearly identical sequences; (2) natural sequences often have high sequence identities due to gene duplication. We wanted to know how many sequences can be removed before the databases start losing homology information. Can a database of sequence...
MAPanalyzer: a novel online tool for analyzing microtubule-associated proteins
The wide functional impacts of microtubules are unleashed and controlled by a battery of microtubule-associated proteins (MAPs). Specialists in the field appreciate the diversity of known MAPs and drive the identification of novel MAPs. By contrast, there is neither a specific database to record known MAPs nor a MAP predictor that can facilitate the discovery of potential MAPs. We here report th...
Journal: Journal of biomolecular techniques : JBT
Volume 15, Issue 4
Publication date: 2004